Continuous Word Recognition Based onthe Stochastic Segment Model
نویسندگان
چکیده
This paper presents an overview of the Boston University continuous word recognition system, which is based on the Stochastic Segment Model (SSM). The key components of the system described here include: a segment-based acoustic model that uses a family of Gaussian distributions to characterize variable length segments; a divisive clustering technique for estimating robust context-dependent models; and recognition using the N-best rescoring formalism, which also provides a mechanism for combining diierent knowledge sources (e.g. SSM and HMM scores). Results are reported for the speaker-independent portion of the Resource Management Corpus, for both the SSM system and a combined BU-SSM/BBN-HMM system.
منابع مشابه
A Comparison of Trajectory and Mixture Modeling in Segment-based Word Recognition
This paper presents a mechanism for implementing mixtures at a phone-subsegment (microsegment) level for continuous word recognition based on the Stochastic Segment Model (SSM). We investigate the issues that are involved in trade-oos between trajectory and mixture modeling in segment-based word recognition. Experimental results are reported on DARPA's speaker-independent Resource Management co...
متن کاملImprovements in the Stochastic Segment Model for Phoneme Recognition
The heart of a speech recognition system is the acoustic model of sub-word units (e.g., phonemes). In this work we discuss refinements of the stochastic segment model, an alternative to hidden Markov models for representation of the acoustic variability of phonemes. We concentrate on mechanisms for better modelling time correlation of features across an entire segment. Results are presented for...
متن کاملStochastic trajectory model with state-mixture for continuous speech recognition
The problem of acoustic modeling for continuous speech recognition is addressed. To deal with coarticulation effects and interspeaker variability, an extension of the Mixture Stochastic Trajectory Model (MSTM) is proposed. MSTM is a segment-based model using phonemes as speech units. In MSTM, the observations of a phoneme are modeled by a set of stochastic trajectories. The trajectories are mod...
متن کاملImprovement in n-best search for continuous speech recognition
In this paper, several techniques for reducing the search complexity of beam search for continuous speech recognition task are proposed. Six heuristic methods for pruning are described and the parameters of the pruning are adjusted to keep constant the word error rate while reducing the computational complexity and memory demand. The evaluation of the effect of each pruning method is performed ...
متن کاملContinuous speech recognition with a TF-IDF acoustic model
Information retrieval methods are frequently used for indexing and retrieving spoken documents, and more recently have been proposed for voice-search amongst a pre-defined set of business entries. In this paper, we show that these methods can be used in an even more fundamental way, as the core component in a continuous speech recognizer. Speech is initially processed and represented as a seque...
متن کامل